Context-aware Adversarial Training for Name Regularity Bias in Named Entity Recognition

Abstract

In this work, we examine the ability of NER models to use contextual information when predicting the type of an ambiguous entity. We introduce NRB, a new testbed carefully designed to diagnose Name Regularity Bias of NER models. Our results indicate that all state-of-the-art models we tested show such a bias; BERT fine-tuned models significantly outperform feature-based (LSTM-CRF) ones on NRB, despite having comparable (sometimes lower) performance on standard benchmarks. To mitigate this bias, we propose a novel model-agnostic training method that adds learnable adversarial noise to some entity mentions, thus forcing models to focus more strongly on the contextual signal, leading to significant gains on NRB. Combining it with two other training strategies, data augmentation and parameter freezing, leads to further gains.
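The idea of perturbing only entity mentions can be illustrated with a toy sketch. This is not the paper's implementation: the linear tagger, the logistic loss, the single shared noise vector, and every function name below are assumptions made for illustration. It shows one gradient-ascent step that updates a noise vector added to mention tokens only, so the mention becomes less informative while context tokens stay untouched.

```python
import math

def perturb_mentions(embeddings, mention_mask, noise):
    """Add the shared noise vector to entity-mention tokens only."""
    return [
        [x + n for x, n in zip(vec, noise)] if is_mention else list(vec)
        for vec, is_mention in zip(embeddings, mention_mask)
    ]

def token_loss(weights, vec, label):
    """Logistic loss of a toy linear tagger; label is +1 or -1 (assumption)."""
    score = sum(w * x for w, x in zip(weights, vec))
    return math.log1p(math.exp(-label * score))

def total_loss(weights, embeddings, labels):
    return sum(token_loss(weights, v, y) for v, y in zip(embeddings, labels))

def noise_gradient(weights, embeddings, labels, mention_mask, noise):
    """Gradient of the total loss with respect to the noise vector."""
    grad = [0.0] * len(noise)
    perturbed = perturb_mentions(embeddings, mention_mask, noise)
    for vec, y, is_mention in zip(perturbed, labels, mention_mask):
        if not is_mention:
            continue  # noise does not touch context tokens
        score = sum(w * x for w, x in zip(weights, vec))
        coeff = -y / (1.0 + math.exp(y * score))  # d(loss)/d(score)
        for i, w in enumerate(weights):
            grad[i] += coeff * w
    return grad

def adversarial_noise_step(weights, embeddings, labels, mention_mask, noise, step_size=0.1):
    """One ascent step: move the noise in the direction that increases the loss."""
    grad = noise_gradient(weights, embeddings, labels, mention_mask, noise)
    return [n + step_size * g for n, g in zip(noise, grad)]

# Hypothetical toy data: three tokens, the middle one is an entity mention.
weights = [1.0, -0.5]
embeddings = [[0.2, 0.1], [1.0, 0.3], [0.0, 0.4]]
labels = [-1, 1, -1]
mention_mask = [False, True, False]
noise = [0.0, 0.0]

new_noise = adversarial_noise_step(weights, embeddings, labels, mention_mask, noise)
clean_loss = total_loss(weights, perturb_mentions(embeddings, mention_mask, noise), labels)
adv_loss = total_loss(weights, perturb_mentions(embeddings, mention_mask, new_noise), labels)
```

In a full training loop, a step like this would presumably alternate with ordinary updates that minimize the model's loss on the perturbed input, which is what pushes the model toward the contextual signal.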

Similar resources

Named Entity Recognition in Persian Text using Deep Learning

Named entity recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...

Using Corpus-derived Name Lists for Named Entity Recognition

This paper describes experiments to establish the performance of a named entity recognition system which builds categorized lists of names from manually annotated training data. Names in text are then identified using only these lists. This approach does not perform as well as state-of-the-art named entity recognition systems. However, we then show that by using simple filtering techniques for imp...

Domain-aware Evaluation of Named Entity Recognition Systems for Croatian

We provide an evaluation of the currently available named entity recognition systems for Croatian. The evaluation puts special emphasis on domain dependence. To this end, we manually annotated a dataset of approximately 1 million tokens of Croatian text from various domains within the newspaper text genre. The dataset was annotated using a three-class named entity tagset – denoting personal na...

Exploiting Dependency Context Gazetteers for Named Entity Recognition

Modern named entity recognition (NER) systems mostly employ a supervised machine learning approach that heavily depends on local contexts. While NER systems based on local contexts provide strong baseline performance, results of recent research have demonstrated that non-local contexts can further improve the performance of these systems. In this paper, we propose the use of a context gazetteer...

An Active Co-Training Algorithm for Biomedical Named-Entity Recognition

Exploiting unlabeled text data with a relatively small labeled corpus has been an active and challenging research topic in text mining, due to the recent growth of the amount of biomedical literature. Biomedical named-entity recognition is an essential prerequisite task before effective text mining of biomedical literature can begin. This paper proposes an Active Co-Training (ACT) algorithm for...

Journal

Journal title: Transactions of the Association for Computational Linguistics

Year: 2021

ISSN: 2307-387X

DOI: https://doi.org/10.1162/tacl_a_00386